Overview

Dataset statistics

Number of variables: 17
Number of observations: 2823
Missing cells: 1562
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 MiB
Average record size in memory: 701.7 B
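
The missing-cell percentage can be checked against the other figures in this table. The following is a minimal sketch, assuming the percentage is the number of missing cells divided by variables times observations; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 17
n_observations = 2823
missing_cells = 1562

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption the result rounds to the 3.3% shown in the table.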

Variable types

Numeric: 5
DateTime: 1
Categorical: 5
Text: 6

Dataset

Description: Test
Author: Test
URL:
Copyright: (c) Test 2026

Alerts

DATA is highly overall correlated with QTR_ID and 1 other field (High correlation)
QUANTITYORDERED is highly overall correlated with SALES (High correlation)
PRICEEACH is highly overall correlated with SALES (High correlation)
SALES is highly overall correlated with QUANTITYORDERED and 1 other field (High correlation)
MONTH_ID is highly overall correlated with QTR_ID (High correlation)
QTR_ID is highly overall correlated with DATA and 1 other field (High correlation)
YEAR_ID is highly overall correlated with DATA (High correlation)
STATE is highly overall correlated with COUNTRY (High correlation)
COUNTRY is highly overall correlated with STATE (High correlation)
STATE has 1486 (52.6%) missing values (Missing)
POSTALCODE has 76 (2.7%) missing values (Missing)
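
The two missing-value alerts can be cross-checked against the overview. This is a small sketch, assuming each percentage is the column's missing count divided by the 2823 observations; the dictionary and variable names are illustrative.

```python
# Counts copied from the two missing-value alerts.
n_observations = 2823
missing_by_column = {"STATE": 1486, "POSTALCODE": 76}

for column, n_missing in missing_by_column.items():
    pct = 100 * n_missing / n_observations
    print(f"{column}: {n_missing} ({pct:.1f}%) missing")

# The two columns together account for every missing cell in the overview.
total_missing = sum(missing_by_column.values())
print(f"total missing cells: {total_missing}")
```

The two counts sum to 1562, the same figure reported as Missing cells in the dataset statistics.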

Reproduction

Analysis started: 2023-06-24 03:31:00.763116
Analysis finished: 2023-06-24 03:31:09.229303
Duration: 8.47 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
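
The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-06-24 03:31:00.763116")
finished = datetime.fromisoformat("2023-06-24 03:31:09.229303")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```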

Variables

DATA
Real number (ℝ)

HIGH CORRELATION 

Distinct: 307
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10258.725
Minimum: 10100
Maximum: 10425
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 10100
5-th percentile: 10115
Q1: 10180
Median: 10262
Q3: 10333.5
95-th percentile: 10405
Maximum: 10425
Range: 325
Interquartile range (IQR): 153.5
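
The last two rows are derived from the others. A short sketch using the usual definitions (range = maximum - minimum, IQR = Q3 - Q1); the variable names are illustrative.

```python
# Quantiles copied from the table for DATA.
minimum, q1, q3, maximum = 10100, 10180, 10333.5, 10425

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(value_range, iqr)
```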

Descriptive statistics

Standard deviation: 92.085478
Coefficient of variation (CV): 0.0089763081
Kurtosis: -1.1733092
Mean: 10258.725
Median Absolute Deviation (MAD): 79
Skewness: 0.013822989
Sum: 28960381
Variance: 8479.7352
Monotonicity: Not monotonic
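
Several of these statistics are tied together by standard identities. The sketch below assumes the textbook definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared) and only checks that the reported figures agree with them up to rounding; it is not the profiler's own code.

```python
# Figures copied from the report for DATA.
n_observations = 2823
total = 28960381
std_dev = 92.085478

mean = total / n_observations  # arithmetic mean
cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2        # variance from the standard deviation

print(f"mean={mean:.3f} cv={cv:.10f} variance={variance:.4f}")
```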
Histogram with fixed size bins (bins=50)
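
For orientation, the bin width implied by 50 fixed-size bins can be worked out from the minimum and maximum. This sketch assumes the bins are equal-width and span exactly the observed range, which the report does not state.

```python
# Minimum, maximum and bin count copied from the report for DATA.
minimum, maximum, bins = 10100, 10425, 50

# Assumption: equal-width bins covering [minimum, maximum].
width = (maximum - minimum) / bins
edges = [minimum + i * width for i in range(bins + 1)]

print(f"bin width: {width}")
print(f"first edges: {edges[:3]}, last edge: {edges[-1]}")
```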
Common values

Value               Count  Frequency (%)
10332               18     0.6%
10106               18     0.6%
10159               18     0.6%
10168               18     0.6%
10398               18     0.6%
10222               18     0.6%
10165               18     0.6%
10275               18     0.6%
10316               18     0.6%
10386               18     0.6%
Other values (297)  2643   93.6%

Minimum 10 values

Value  Count  Frequency (%)
10100  4      0.1%
10101  4      0.1%
10102  2      0.1%
10103  16     0.6%
10104  13     0.5%
10105  15     0.5%
10106  18     0.6%
10107  8      0.3%
10108  16     0.6%
10109  6      0.2%

Maximum 10 values

Value  Count  Frequency (%)
10425  13     0.5%
10424  6      0.2%
10423  5      0.2%
10422  2      0.1%
10421  2      0.1%
10420  13     0.5%
10419  14     0.5%
10417  6      0.2%
10416  14     0.5%
10415  5      0.2%

QUANTITYORDERED
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.092809
Minimum: 6
Maximum: 97
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 21
Q1: 27
Median: 35
Q3: 43
95-th percentile: 49
Maximum: 97
Range: 91
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 9.7414427
Coefficient of variation (CV): 0.27759085
Kurtosis: 0.41574379
Mean: 35.092809
Median Absolute Deviation (MAD): 8
Skewness: 0.36258533
Sum: 99067
Variance: 94.895707
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
34 112
 
4.0%
21 103
 
3.6%
46 101
 
3.6%
27 100
 
3.5%
31 97
 
3.4%
41 97
 
3.4%
45 97
 
3.4%
26 96
 
3.4%
29 94
 
3.3%
48 94
 
3.3%
Other values (48) 1832
64.9%
ValueCountFrequency (%)
6 2
 
0.1%
10 2
 
0.1%
11 2
 
0.1%
12 1
 
< 0.1%
13 1
 
< 0.1%
15 4
 
0.1%
16 1
 
< 0.1%
18 3
 
0.1%
19 3
 
0.1%
20 93
3.3%
ValueCountFrequency (%)
97 1
 
< 0.1%
85 1
 
< 0.1%
77 1
 
< 0.1%
76 3
0.1%
70 2
 
0.1%
66 5
0.2%
65 1
 
< 0.1%
64 3
0.1%
62 1
 
< 0.1%
61 3
0.1%

PRICEEACH
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1016
Distinct (%): 36.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.658544
Minimum: 26.88
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 26.88
5-th percentile: 42.67
Q1: 68.86
Median: 95.7
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 73.12
Interquartile range (IQR): 31.14

Descriptive statistics

Standard deviation: 20.174277
Coefficient of variation (CV): 0.24115022
Kurtosis: -0.37481769
Mean: 83.658544
Median Absolute Deviation (MAD): 4.3
Skewness: -0.94664886
Sum: 236168.07
Variance: 407.00143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
100 1304
46.2%
59.87 6
 
0.2%
96.34 6
 
0.2%
57.73 5
 
0.2%
80.55 5
 
0.2%
90.17 5
 
0.2%
67.14 5
 
0.2%
61.99 5
 
0.2%
89.38 5
 
0.2%
51.93 5
 
0.2%
Other values (1006) 1472
52.1%
ValueCountFrequency (%)
26.88 1
 
< 0.1%
27.22 1
 
< 0.1%
28.29 1
 
< 0.1%
28.88 1
 
< 0.1%
29.21 2
0.1%
29.54 3
0.1%
29.7 1
 
< 0.1%
29.87 1
 
< 0.1%
30.06 2
0.1%
30.2 1
 
< 0.1%
ValueCountFrequency (%)
100 1304
46.2%
99.91 1
 
< 0.1%
99.82 2
 
0.1%
99.72 1
 
< 0.1%
99.69 1
 
< 0.1%
99.67 1
 
< 0.1%
99.66 1
 
< 0.1%
99.58 1
 
< 0.1%
99.57 1
 
< 0.1%
99.55 2
 
0.1%

SALES
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2763
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3553.8891
Minimum: 482.13
Maximum: 14082.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 482.13
5-th percentile: 1268.757
Q1: 2203.43
Median: 3184.8
Q3: 4508
95-th percentile: 7108.12
Maximum: 14082.8
Range: 13600.67
Interquartile range (IQR): 2304.57

Descriptive statistics

Standard deviation: 1841.8651
Coefficient of variation (CV): 0.51826747
Kurtosis: 1.7926765
Mean: 3553.8891
Median Absolute Deviation (MAD): 1102.31
Skewness: 1.161076
Sum: 10032629
Variance: 3392467.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3003 3
 
0.1%
5464.69 2
 
0.1%
2257.92 2
 
0.1%
5004.8 2
 
0.1%
2172.48 2
 
0.1%
4948.2 2
 
0.1%
2213.4 2
 
0.1%
2441.04 2
 
0.1%
3184.8 2
 
0.1%
1463 2
 
0.1%
Other values (2753) 2802
99.3%
ValueCountFrequency (%)
482.13 1
< 0.1%
541.14 1
< 0.1%
553.95 1
< 0.1%
577.6 1
< 0.1%
640.05 1
< 0.1%
651.8 1
< 0.1%
652.35 1
< 0.1%
683.8 1
< 0.1%
694.6 1
< 0.1%
703.6 1
< 0.1%
ValueCountFrequency (%)
14082.8 1
< 0.1%
12536.5 1
< 0.1%
12001 1
< 0.1%
11887.8 1
< 0.1%
11886.6 1
< 0.1%
11739.7 1
< 0.1%
11623.7 1
< 0.1%
11336.7 1
< 0.1%
11279.2 1
< 0.1%
10993.5 1
< 0.1%
Distinct: 252
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Memory size: 22.2 KiB
Minimum: 2003-01-06 00:00:00
Maximum: 2005-05-31 00:00:00
Histogram with fixed size bins (bins=50)
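
To put the 252 distinct dates in context, the calendar span between the reported minimum and maximum can be computed directly. A minimal sketch with the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Minimum and maximum copied from the DateTime variable's statistics.
first = datetime.fromisoformat("2003-01-06 00:00:00")
last = datetime.fromisoformat("2005-05-31 00:00:00")

# Number of days between the earliest and latest timestamps.
span_days = (last - first).days
print(f"{span_days} days between first and last value")
```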

QTR_ID
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 160.0 KiB

Value  Count
4      1094
1      665
2      561
3      503

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2823
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 3
5th row: 4

Common Values

ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%
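
The Frequency (%) column is each count as a share of all rows. A small sketch, assuming the percentages are rounded to one decimal place; the names are illustrative.

```python
# Counts copied from the Common Values table for QTR_ID.
counts = {"4": 1094, "1": 665, "2": 561, "3": 503}

total = sum(counts.values())
frequencies = {value: f"{100 * n / total:.1f}%" for value, n in counts.items()}

print(total)
print(frequencies)
```

The counts sum to 2823, the number of observations, which is consistent with QTR_ID having no missing values.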

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%

Most occurring characters

ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2823
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%

Most occurring scripts

ValueCountFrequency (%)
Common 2823
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2823
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 1094
38.8%
1 665
23.6%
2 561
19.9%
3 503
17.8%

MONTH_ID
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0924548
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 8
Q3: 11
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.6566333
Coefficient of variation (CV): 0.51556667
Kurtosis: -1.3832748
Mean: 7.0924548
Median Absolute Deviation (MAD): 3
Skewness: -0.27290156
Sum: 20022
Variance: 13.370967
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
11 597
21.1%
10 317
11.2%
5 252
8.9%
1 229
 
8.1%
2 224
 
7.9%
3 212
 
7.5%
8 191
 
6.8%
12 180
 
6.4%
4 178
 
6.3%
9 171
 
6.1%
Other values (2) 272
9.6%
ValueCountFrequency (%)
1 229
8.1%
2 224
7.9%
3 212
7.5%
4 178
6.3%
5 252
8.9%
6 131
4.6%
7 141
5.0%
8 191
6.8%
9 171
6.1%
10 317
11.2%
ValueCountFrequency (%)
12 180
 
6.4%
11 597
21.1%
10 317
11.2%
9 171
 
6.1%
8 191
 
6.8%
7 141
 
5.0%
6 131
 
4.6%
5 252
8.9%
4 178
 
6.3%
3 212
 
7.5%

YEAR_ID
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 168.3 KiB

Value  Count
2004   1345
2003   1000
2005   478

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 11292
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2003
2nd row: 2003
3rd row: 2003
4th row: 2003
5th row: 2003

Common Values

ValueCountFrequency (%)
2004 1345
47.6%
2003 1000
35.4%
2005 478
 
16.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2004 1345
47.6%
2003 1000
35.4%
2005 478
 
16.9%

Most occurring characters

ValueCountFrequency (%)
0 5646
50.0%
2 2823
25.0%
4 1345
 
11.9%
3 1000
 
8.9%
5 478
 
4.2%
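
The character counts follow from the value counts, because every row is a four-character year. The sketch below rebuilds them from the Common Values table; the variable names are illustrative.

```python
from collections import Counter

# Value counts copied from the Common Values table for YEAR_ID.
year_counts = {"2004": 1345, "2003": 1000, "2005": 478}

# Each character of a value occurs once per row holding that value.
char_counts = Counter()
for year, n_rows in year_counts.items():
    for ch in year:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())
print(char_counts.most_common())
print(total_chars)
```

Every year contributes one "2" and two "0" characters, which is why those two characters make up 25.0% and 50.0% of the total.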

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11292
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5646
50.0%
2 2823
25.0%
4 1345
 
11.9%
3 1000
 
8.9%
5 478
 
4.2%

Most occurring scripts

ValueCountFrequency (%)
Common 11292
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5646
50.0%
2 2823
25.0%
4 1345
 
11.9%
3 1000
 
8.9%
5 478
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11292
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5646
50.0%
2 2823
25.0%
4 1345
 
11.9%
3 1000
 
8.9%
5 478
 
4.2%

PRODUCTLINE
Categorical

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 187.4 KiB

Value             Count
Classic Cars      967
Vintage Cars      607
Motorcycles       331
Planes            306
Trucks and Buses  301
Other values (2)  311

Length

Max length: 16
Median length: 12
Mean length: 10.914984
Min length: 5

Characters and Unicode

Total characters: 30813
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Motorcycles
2nd row: Motorcycles
3rd row: Motorcycles
4th row: Motorcycles
5th row: Motorcycles

Common Values

ValueCountFrequency (%)
Classic Cars 967
34.3%
Vintage Cars 607
21.5%
Motorcycles 331
 
11.7%
Planes 306
 
10.8%
Trucks and Buses 301
 
10.7%
Ships 234
 
8.3%
Trains 77
 
2.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cars 1574
31.5%
classic 967
19.3%
vintage 607
 
12.1%
motorcycles 331
 
6.6%
planes 306
 
6.1%
trucks 301
 
6.0%
and 301
 
6.0%
buses 301
 
6.0%
ships 234
 
4.7%
trains 77
 
1.5%
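
The word-level table above can be reproduced from the Common Values counts. This is a sketch that assumes values are lowercased and split on whitespace; that matches the numbers shown, but the report does not state its tokenisation rule.

```python
from collections import Counter

# Value counts copied from the Common Values table for PRODUCTLINE.
value_counts = {
    "Classic Cars": 967,
    "Vintage Cars": 607,
    "Motorcycles": 331,
    "Planes": 306,
    "Trucks and Buses": 301,
    "Ships": 234,
    "Trains": 77,
}

# Assumed tokenisation: lowercase, then split on whitespace.
word_counts = Counter()
for value, n_rows in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += n_rows

total_words = sum(word_counts.values())
cars_pct = 100 * word_counts["cars"] / total_words
print(word_counts.most_common(3))
print(f"cars: {cars_pct:.1f}% of {total_words} words")
```

Here "cars" reaches 1574 because it appears in both Classic Cars (967) and Vintage Cars (607).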

Most occurring characters

ValueCountFrequency (%)
s 5359
17.4%
a 3832
12.4%
C 2541
 
8.2%
r 2283
 
7.4%
2176
 
7.1%
c 1930
 
6.3%
i 1885
 
6.1%
l 1604
 
5.2%
e 1545
 
5.0%
n 1291
 
4.2%
Other values (15) 6367
20.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23939
77.7%
Uppercase Letter 4698
 
15.2%
Space Separator 2176
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 5359
22.4%
a 3832
16.0%
r 2283
9.5%
c 1930
 
8.1%
i 1885
 
7.9%
l 1604
 
6.7%
e 1545
 
6.5%
n 1291
 
5.4%
t 938
 
3.9%
o 662
 
2.8%
Other values (7) 2610
10.9%
Uppercase Letter
ValueCountFrequency (%)
C 2541
54.1%
V 607
 
12.9%
T 378
 
8.0%
M 331
 
7.0%
P 306
 
6.5%
B 301
 
6.4%
S 234
 
5.0%
Space Separator
ValueCountFrequency (%)
2176
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28637
92.9%
Common 2176
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 5359
18.7%
a 3832
13.4%
C 2541
8.9%
r 2283
8.0%
c 1930
 
6.7%
i 1885
 
6.6%
l 1604
 
5.6%
e 1545
 
5.4%
n 1291
 
4.5%
t 938
 
3.3%
Other values (14) 5429
19.0%
Common
ValueCountFrequency (%)
2176
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30813
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 5359
17.4%
a 3832
12.4%
C 2541
 
8.2%
r 2283
 
7.4%
2176
 
7.1%
c 1930
 
6.3%
i 1885
 
6.1%
l 1604
 
5.2%
e 1545
 
5.0%
n 1291
 
4.2%
Other values (15) 6367
20.7%

PHONE
Text

Distinct: 91
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 189.3 KiB

Length

Max length: 17
Median length: 10
Mean length: 11.636557
Min length: 9

Characters and Unicode

Total characters: 32850
Distinct characters: 16
Distinct categories: 7
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2125557818
2nd row: 26.47.1555
3rd row: +33 1 46 62 7555
4th row: 6265557265
5th row: 6505551386
ValueCountFrequency (%)
555 375
 
6.7%
91 291
 
5.2%
94 259
 
4.6%
44 259
 
4.6%
4155551450 180
 
3.2%
8555 122
 
2.2%
171 118
 
2.1%
3555 82
 
1.5%
65 79
 
1.4%
4555 78
 
1.4%
Other values (127) 3750
67.0%

Most occurring characters

ValueCountFrequency (%)
5 10957
33.4%
2770
 
8.4%
4 2554
 
7.8%
1 2528
 
7.7%
2 2161
 
6.6%
9 1685
 
5.1%
6 1685
 
5.1%
0 1623
 
4.9%
8 1476
 
4.5%
3 1285
 
3.9%
Other values (6) 4126
 
12.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 27165
82.7%
Space Separator 2770
 
8.4%
Dash Punctuation 704
 
2.1%
Open Punctuation 626
 
1.9%
Close Punctuation 626
 
1.9%
Other Punctuation 588
 
1.8%
Math Symbol 371
 
1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 10957
40.3%
4 2554
 
9.4%
1 2528
 
9.3%
2 2161
 
8.0%
9 1685
 
6.2%
6 1685
 
6.2%
0 1623
 
6.0%
8 1476
 
5.4%
3 1285
 
4.7%
7 1211
 
4.5%
Space Separator
ValueCountFrequency (%)
2770
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 704
100.0%
Open Punctuation
ValueCountFrequency (%)
( 626
100.0%
Close Punctuation
ValueCountFrequency (%)
) 626
100.0%
Other Punctuation
ValueCountFrequency (%)
. 588
100.0%
Math Symbol
ValueCountFrequency (%)
+ 371
100.0%
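
The category names in these tables are Unicode general categories. As a rough illustration, Python's standard `unicodedata` module returns the two-letter abbreviation for each character (Nd = Decimal Number, Zs = Space Separator, Sm = Math Symbol, Pd/Ps/Pe/Po = Dash/Open/Close/Other Punctuation). The sketch applies it to the third PHONE sample row; whether the profiler uses this module internally is not stated in the report.

```python
import unicodedata
from collections import Counter

# Third sample row of PHONE, copied from the Sample section.
sample = "+33 1 46 62 7555"

# Count characters by Unicode general category abbreviation.
category_counts = Counter(unicodedata.category(ch) for ch in sample)
print(category_counts)

# The punctuation characters seen in PHONE map to separate categories.
for ch in "-().":
    print(repr(ch), unicodedata.category(ch))
```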

Most occurring scripts

ValueCountFrequency (%)
Common 32850
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 10957
33.4%
2770
 
8.4%
4 2554
 
7.8%
1 2528
 
7.7%
2 2161
 
6.6%
9 1685
 
5.1%
6 1685
 
5.1%
0 1623
 
4.9%
8 1476
 
4.5%
3 1285
 
3.9%
Other values (6) 4126
 
12.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32850
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 10957
33.4%
2770
 
8.4%
4 2554
 
7.8%
1 2528
 
7.7%
2 2161
 
6.6%
9 1685
 
5.1%
6 1685
 
5.1%
0 1623
 
4.9%
8 1476
 
4.5%
3 1285
 
3.9%
Other values (6) 4126
 
12.6%
Distinct: 92
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 212.7 KiB

Length

Max length: 42
Median length: 36
Mean length: 19.445979
Min length: 11

Characters and Unicode

Total characters: 54896
Distinct characters: 67
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 897 Long Airport Avenue
2nd row: 59 rue de l'Abbaye
3rd row: 27 rue du Colonel Pierre Avia
4th row: 78934 Hillside Dr.
5th row: 7734 Strong St.
ValueCountFrequency (%)
st 442
 
4.6%
c 306
 
3.2%
rue 281
 
2.9%
moralzarzal 259
 
2.7%
86 259
 
2.7%
strong 250
 
2.6%
street 216
 
2.2%
5677 180
 
1.9%
furth 135
 
1.4%
circle 135
 
1.4%
Other values (210) 7216
74.6%

Most occurring characters

ValueCountFrequency (%)
6904
 
12.6%
e 3914
 
7.1%
r 3579
 
6.5%
a 2883
 
5.3%
t 2545
 
4.6%
n 2485
 
4.5%
o 2437
 
4.4%
l 1979
 
3.6%
i 1901
 
3.5%
u 1438
 
2.6%
Other values (57) 24831
45.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30949
56.4%
Decimal Number 8168
 
14.9%
Space Separator 6904
 
12.6%
Uppercase Letter 6337
 
11.5%
Other Punctuation 2226
 
4.1%
Dash Punctuation 270
 
0.5%
Currency Symbol 23
 
< 0.1%
Control 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3914
12.6%
r 3579
11.6%
a 2883
9.3%
t 2545
 
8.2%
n 2485
 
8.0%
o 2437
 
7.9%
l 1979
 
6.4%
i 1901
 
6.1%
u 1438
 
4.6%
s 1187
 
3.8%
Other values (15) 6601
21.3%
Uppercase Letter
ValueCountFrequency (%)
S 1303
20.6%
C 826
13.0%
M 586
9.2%
A 464
 
7.3%
B 379
 
6.0%
L 357
 
5.6%
P 310
 
4.9%
D 305
 
4.8%
R 268
 
4.2%
F 253
 
4.0%
Other values (12) 1286
20.3%
Decimal Number
ValueCountFrequency (%)
7 1132
13.9%
6 1129
13.8%
2 983
12.0%
5 918
11.2%
8 872
10.7%
3 871
10.7%
4 790
9.7%
1 667
8.2%
0 427
 
5.2%
9 379
 
4.6%
Other Punctuation
ValueCountFrequency (%)
. 910
40.9%
, 795
35.7%
/ 349
 
15.7%
' 108
 
4.9%
? 38
 
1.7%
# 26
 
1.2%
Space Separator
ValueCountFrequency (%)
6904
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 270
100.0%
Currency Symbol
ValueCountFrequency (%)
¤ 23
100.0%
Control
ValueCountFrequency (%)
„ 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37286
67.9%
Common 17610
32.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3914
 
10.5%
r 3579
 
9.6%
a 2883
 
7.7%
t 2545
 
6.8%
n 2485
 
6.7%
o 2437
 
6.5%
l 1979
 
5.3%
i 1901
 
5.1%
u 1438
 
3.9%
S 1303
 
3.5%
Other values (37) 12822
34.4%
Common
ValueCountFrequency (%)
6904
39.2%
7 1132
 
6.4%
6 1129
 
6.4%
2 983
 
5.6%
5 918
 
5.2%
. 910
 
5.2%
8 872
 
5.0%
3 871
 
4.9%
, 795
 
4.5%
4 790
 
4.5%
Other values (10) 2306
 
13.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54854
99.9%
None 42
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6904
 
12.6%
e 3914
 
7.1%
r 3579
 
6.5%
a 2883
 
5.3%
t 2545
 
4.6%
n 2485
 
4.5%
o 2437
 
4.4%
l 1979
 
3.6%
i 1901
 
3.5%
u 1438
 
2.6%
Other values (55) 24789
45.2%
None
ValueCountFrequency (%)
¤ 23
54.8%
„ 19
45.2%

CITY
Text

Distinct: 73
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 178.6 KiB

Length

Max length: 14
Median length: 12
Mean length: 7.7530995
Min length: 3

Characters and Unicode

Total characters: 21887
Distinct characters: 47
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NYC
2nd row: Reims
3rd row: Paris
4th row: Pasadena
5th row: San Francisco
ValueCountFrequency (%)
san 307
 
9.0%
madrid 304
 
8.9%
rafael 180
 
5.3%
nyc 152
 
4.4%
singapore 79
 
2.3%
new 78
 
2.3%
paris 70
 
2.0%
francisco 62
 
1.8%
bedford 61
 
1.8%
nantes 60
 
1.8%
Other values (72) 2073
60.5%

Most occurring characters

ValueCountFrequency (%)
a 2614
 
11.9%
e 2008
 
9.2%
n 1562
 
7.1%
r 1501
 
6.9%
i 1327
 
6.1%
o 1298
 
5.9%
l 1083
 
4.9%
s 1049
 
4.8%
d 1019
 
4.7%
603
 
2.8%
Other values (37) 7823
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17522
80.1%
Uppercase Letter 3730
 
17.0%
Space Separator 603
 
2.8%
Dash Punctuation 32
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2614
14.9%
e 2008
11.5%
n 1562
8.9%
r 1501
8.6%
i 1327
 
7.6%
o 1298
 
7.4%
l 1083
 
6.2%
s 1049
 
6.0%
d 1019
 
5.8%
t 523
 
3.0%
Other values (14) 3538
20.2%
Uppercase Letter
ValueCountFrequency (%)
S 553
14.8%
M 529
14.2%
B 417
11.2%
N 391
10.5%
C 296
7.9%
R 260
 
7.0%
L 190
 
5.1%
P 170
 
4.6%
Y 152
 
4.1%
G 91
 
2.4%
Other values (11) 681
18.3%
Space Separator
ValueCountFrequency (%)
603
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21252
97.1%
Common 635
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2614
 
12.3%
e 2008
 
9.4%
n 1562
 
7.3%
r 1501
 
7.1%
i 1327
 
6.2%
o 1298
 
6.1%
l 1083
 
5.1%
s 1049
 
4.9%
d 1019
 
4.8%
S 553
 
2.6%
Other values (35) 7238
34.1%
Common
ValueCountFrequency (%)
603
95.0%
- 32
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21887
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2614
 
11.9%
e 2008
 
9.2%
n 1562
 
7.1%
r 1501
 
6.9%
i 1327
 
6.1%
o 1298
 
5.9%
l 1083
 
4.9%
s 1049
 
4.8%
d 1019
 
4.7%
603
 
2.8%
Other values (37) 7823
35.7%

STATE
Categorical

HIGH CORRELATION  MISSING 

Distinct: 16
Distinct (%): 1.2%
Missing: 1486
Missing (%): 52.6%
Memory size: 124.8 KiB

Value              Count
CA                 416
MA                 190
NY                 178
NSW                92
Victoria           78
Other values (11)  383

Length

Max length: 13
Median length: 2
Mean length: 2.9050112
Min length: 2

Characters and Unicode

Total characters: 3884
Distinct characters: 35
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NY
2nd row: CA
3rd row: CA
4th row: CA
5th row: CA

Common Values

ValueCountFrequency (%)
CA 416
 
14.7%
MA 190
 
6.7%
NY 178
 
6.3%
NSW 92
 
3.3%
Victoria 78
 
2.8%
PA 75
 
2.7%
CT 61
 
2.2%
BC 48
 
1.7%
NH 34
 
1.2%
Tokyo 32
 
1.1%
Other values (6) 133
 
4.7%
(Missing) 1486
52.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca 416
29.9%
ma 190
13.7%
ny 178
12.8%
nsw 92
 
6.6%
victoria 78
 
5.6%
pa 75
 
5.4%
ct 61
 
4.4%
bc 48
 
3.5%
nh 34
 
2.4%
tokyo 32
 
2.3%
Other values (8) 185
13.3%

Most occurring characters

ValueCountFrequency (%)
A 681
17.5%
C 525
13.5%
N 354
 
9.1%
M 190
 
4.9%
i 182
 
4.7%
Y 178
 
4.6%
o 168
 
4.3%
a 133
 
3.4%
W 118
 
3.0%
V 107
 
2.8%
Other values (25) 1248
32.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2599
66.9%
Lowercase Letter 1233
31.7%
Space Separator 52
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 182
14.8%
o 168
13.6%
a 133
10.8%
t 104
8.4%
e 100
8.1%
c 100
8.1%
r 78
 
6.3%
s 61
 
4.9%
k 52
 
4.2%
l 41
 
3.3%
Other values (8) 214
17.4%
Uppercase Letter
ValueCountFrequency (%)
A 681
26.2%
C 525
20.2%
N 354
13.6%
M 190
 
7.3%
Y 178
 
6.8%
W 118
 
4.5%
V 107
 
4.1%
T 93
 
3.6%
S 92
 
3.5%
P 75
 
2.9%
Other values (6) 186
 
7.2%
Space Separator
ValueCountFrequency (%)
52
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3832
98.7%
Common 52
 
1.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 681
17.8%
C 525
13.7%
N 354
 
9.2%
M 190
 
5.0%
i 182
 
4.7%
Y 178
 
4.6%
o 168
 
4.4%
a 133
 
3.5%
W 118
 
3.1%
V 107
 
2.8%
Other values (24) 1196
31.2%
Common
ValueCountFrequency (%)
52
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3884
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 681
17.5%
C 525
13.5%
N 354
 
9.1%
M 190
 
4.9%
i 182
 
4.7%
Y 178
 
4.6%
o 168
 
4.3%
a 133
 
3.4%
W 118
 
3.0%
V 107
 
2.8%
Other values (25) 1248
32.1%

POSTALCODE
Text

MISSING 

Distinct: 73
Distinct (%): 2.7%
Missing: 76
Missing (%): 2.7%
Memory size: 169.4 KiB

Length

Max length: 9
Median length: 5
Mean length: 5.2133236
Min length: 1

Characters and Unicode

Total characters: 14321
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10022
2nd row: 51100
3rd row: 75508
4th row: 90003
5th row: 94217
ValueCountFrequency (%)
28034 259
 
8.4%
97562 205
 
6.6%
10022 152
 
4.9%
94217 89
 
2.9%
50553 61
 
2.0%
44000 60
 
1.9%
3004 55
 
1.8%
n 53
 
1.7%
ec2 51
 
1.6%
5nt 51
 
1.6%
Other values (75) 2061
66.5%

Most occurring characters

ValueCountFrequency (%)
0 3173
22.2%
2 1890
13.2%
1 1434
10.0%
4 1044
 
7.3%
3 1035
 
7.2%
5 990
 
6.9%
7 947
 
6.6%
9 763
 
5.3%
6 740
 
5.2%
8 712
 
5.0%
Other values (22) 1593
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12728
88.9%
Uppercase Letter 1071
 
7.5%
Space Separator 350
 
2.4%
Dash Punctuation 172
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 134
12.5%
T 106
9.9%
F 104
9.7%
W 93
 
8.7%
M 78
 
7.3%
C 73
 
6.8%
P 64
 
6.0%
S 57
 
5.3%
X 55
 
5.1%
E 51
 
4.8%
Other values (10) 256
23.9%
Decimal Number
ValueCountFrequency (%)
0 3173
24.9%
2 1890
14.8%
1 1434
11.3%
4 1044
 
8.2%
3 1035
 
8.1%
5 990
 
7.8%
7 947
 
7.4%
9 763
 
6.0%
6 740
 
5.8%
8 712
 
5.6%
Space Separator
ValueCountFrequency (%)
350
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 172
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13250
92.5%
Latin 1071
 
7.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 134
12.5%
T 106
9.9%
F 104
9.7%
W 93
 
8.7%
M 78
 
7.3%
C 73
 
6.8%
P 64
 
6.0%
S 57
 
5.3%
X 55
 
5.1%
E 51
 
4.8%
Other values (10) 256
23.9%
Common
ValueCountFrequency (%)
0 3173
23.9%
2 1890
14.3%
1 1434
10.8%
4 1044
 
7.9%
3 1035
 
7.8%
5 990
 
7.5%
7 947
 
7.1%
9 763
 
5.8%
6 740
 
5.6%
8 712
 
5.4%
Other values (2) 522
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3173
22.2%
2 1890
13.2%
1 1434
10.0%
4 1044
 
7.3%
3 1035
 
7.2%
5 990
 
6.9%
7 947
 
6.6%
9 763
 
5.3%
6 740
 
5.2%
8 712
 
5.0%
Other values (22) 1593
11.1%

COUNTRY
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 171.2 KiB

Value              Count
USA                1004
Spain              342
France             314
Australia          185
UK                 144
Other values (14)  834

Length

Max length: 11
Median length: 9
Mean length: 5.0446334
Min length: 2

Characters and Unicode

Total characters: 14241
Distinct characters: 33
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USA
2nd row: France
3rd row: France
4th row: USA
5th row: USA

Common Values

ValueCountFrequency (%)
USA 1004
35.6%
Spain 342
 
12.1%
France 314
 
11.1%
Australia 185
 
6.6%
UK 144
 
5.1%
Italy 113
 
4.0%
Finland 92
 
3.3%
Norway 85
 
3.0%
Singapore 79
 
2.8%
Canada 70
 
2.5%
Other values (9) 395
 
14.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
usa 1004
35.6%
spain 342
 
12.1%
france 314
 
11.1%
australia 185
 
6.6%
uk 144
 
5.1%
italy 113
 
4.0%
finland 92
 
3.3%
norway 85
 
3.0%
singapore 79
 
2.8%
canada 70
 
2.5%
Other values (9) 395
 
14.0%

Most occurring characters

ValueCountFrequency (%)
a 1936
13.6%
S 1513
10.6%
n 1296
 
9.1%
A 1244
 
8.7%
U 1148
 
8.1%
i 895
 
6.3%
r 890
 
6.2%
e 738
 
5.2%
p 525
 
3.7%
l 496
 
3.5%
Other values (23) 3560
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9266
65.1%
Uppercase Letter 4975
34.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1936
20.9%
n 1296
14.0%
i 895
9.7%
r 890
9.6%
e 738
 
8.0%
p 525
 
5.7%
l 496
 
5.4%
t 384
 
4.1%
c 314
 
3.4%
u 273
 
2.9%
Other values (10) 1519
16.4%
Uppercase Letter
ValueCountFrequency (%)
S 1513
30.4%
A 1244
25.0%
U 1148
23.1%
F 406
 
8.2%
K 144
 
2.9%
I 129
 
2.6%
N 85
 
1.7%
C 70
 
1.4%
D 63
 
1.3%
G 62
 
1.2%
Other values (3) 111
 
2.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 14241
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1936
13.6%
S 1513
10.6%
n 1296
 
9.1%
A 1244
 
8.7%
U 1148
 
8.1%
i 895
 
6.3%
r 890
 
6.2%
e 738
 
5.2%
p 525
 
3.7%
l 496
 
3.5%
Other values (23) 3560
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14241
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1936
13.6%
S 1513
10.6%
n 1296
 
9.1%
A 1244
 
8.7%
U 1148
 
8.1%
i 895
 
6.3%
r 890
 
6.2%
e 738
 
5.2%
p 525
 
3.7%
l 496
 
3.5%
Other values (23) 3560
25.0%
Distinct: 77
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 175.0 KiB

Length

Max length: 11
Median length: 9
Mean length: 6.4413744
Min length: 2

Characters and Unicode

Total characters: 18184
Distinct characters: 45
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yu
2nd row: Henriot
3rd row: Da Cunha
4th row: Young
5th row: Brown
ValueCountFrequency (%)
freyre 259
 
9.1%
nelson 204
 
7.2%
young 115
 
4.0%
frick 91
 
3.2%
brown 88
 
3.1%
yu 80
 
2.8%
hernandez 70
 
2.5%
ferguson 55
 
1.9%
king 54
 
1.9%
labrune 53
 
1.9%
Other values (68) 1774
62.4%

Most occurring characters

ValueCountFrequency (%)
e 2297
 
12.6%
r 1850
 
10.2%
n 1769
 
9.7%
o 1355
 
7.5%
a 1137
 
6.3%
i 952
 
5.2%
s 759
 
4.2%
l 701
 
3.9%
u 647
 
3.6%
t 579
 
3.2%
Other values (35) 6138
33.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15229
83.7%
Uppercase Letter 2889
 
15.9%
Other Punctuation 46
 
0.3%
Space Separator 20
 
0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 2297 | 15.1%
r | 1850 | 12.1%
n | 1769 | 11.6%
o | 1355 | 8.9%
a | 1137 | 7.5%
i | 952 | 6.3%
s | 759 | 5.0%
l | 701 | 4.6%
u | 647 | 4.2%
t | 579 | 3.8%
Other values (15) | 3183 | 20.9%

Uppercase Letter
Value | Count | Frequency (%)
F | 458 | 15.9%
H | 280 | 9.7%
N | 247 | 8.5%
B | 229 | 7.9%
Y | 221 | 7.6%
K | 192 | 6.6%
S | 165 | 5.7%
C | 162 | 5.6%
L | 161 | 5.6%
T | 147 | 5.1%
Other values (8) | 627 | 21.7%

Other Punctuation
Value | Count | Frequency (%)
' | 46 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 18118 | 99.6%
Common | 66 | 0.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 2297 | 12.7%
r | 1850 | 10.2%
n | 1769 | 9.8%
o | 1355 | 7.5%
a | 1137 | 6.3%
i | 952 | 5.3%
s | 759 | 4.2%
l | 701 | 3.9%
u | 647 | 3.6%
t | 579 | 3.2%
Other values (33) | 6072 | 33.5%

Common
Value | Count | Frequency (%)
' | 46 | 69.7%
(space) | 20 | 30.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 18184 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 2297 | 12.6%
r | 1850 | 10.2%
n | 1769 | 9.7%
o | 1355 | 7.5%
a | 1137 | 6.3%
i | 952 | 5.2%
s | 759 | 4.2%
l | 701 | 3.9%
u | 647 | 3.6%
t | 579 | 3.2%
Other values (35) | 6138 | 33.8%

CONTACTFIRSTNAME

Distinct: 72
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 173.9 KiB

Length

Max length: 10
Median length: 9
Mean length: 5.6680836
Min length: 3

Characters and Unicode

Total characters: 16001
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Kwai
2nd row: Paul
3rd row: Daniel
4th row: Julie
5th row: Julie

Value | Count | Frequency (%)
diego | 259 | 9.0%
valarie | 257 | 8.9%
julie | 117 | 4.1%
sue | 84 | 2.9%
michael | 84 | 2.9%
juri | 60 | 2.1%
maria | 58 | 2.0%
peter | 55 | 1.9%
elizabeth | 55 | 1.9%
janine | 53 | 1.8%
Other values (64) | 1791 | 62.3%

Most occurring characters

Value | Count | Frequency (%)
e | 2067 | 12.9%
i | 1890 | 11.8%
a | 1875 | 11.7%
r | 1069 | 6.7%
n | 1049 | 6.6%
l | 1017 | 6.4%
o | 846 | 5.3%
t | 573 | 3.6%
u | 505 | 3.2%
J | 420 | 2.6%
Other values (33) | 4690 | 29.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 13046 | 81.5%
Uppercase Letter | 2873 | 18.0%
Space Separator | 50 | 0.3%
Other Punctuation | 32 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 2067 | 15.8%
i | 1890 | 14.5%
a | 1875 | 14.4%
r | 1069 | 8.2%
n | 1049 | 8.0%
l | 1017 | 7.8%
o | 846 | 6.5%
t | 573 | 4.4%
u | 505 | 3.9%
g | 391 | 3.0%
Other values (13) | 1764 | 13.5%

Uppercase Letter
Value | Count | Frequency (%)
J | 420 | 14.6%
M | 389 | 13.5%
V | 363 | 12.6%
D | 359 | 12.5%
A | 220 | 7.7%
P | 204 | 7.1%
S | 146 | 5.1%
K | 131 | 4.6%
E | 121 | 4.2%
W | 92 | 3.2%
Other values (8) | 428 | 14.9%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
¡ | 32 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 15919 | 99.5%
Common | 82 | 0.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 2067 | 13.0%
i | 1890 | 11.9%
a | 1875 | 11.8%
r | 1069 | 6.7%
n | 1049 | 6.6%
l | 1017 | 6.4%
o | 846 | 5.3%
t | 573 | 3.6%
u | 505 | 3.2%
J | 420 | 2.6%
Other values (31) | 4608 | 28.9%

Common
Value | Count | Frequency (%)
(space) | 50 | 61.0%
¡ | 32 | 39.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15969 | 99.8%
None | 32 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 2067 | 12.9%
i | 1890 | 11.8%
a | 1875 | 11.7%
r | 1069 | 6.7%
n | 1049 | 6.6%
l | 1017 | 6.4%
o | 846 | 5.3%
t | 573 | 3.6%
u | 505 | 3.2%
J | 420 | 2.6%
Other values (32) | 4658 | 29.2%

None
Value | Count | Frequency (%)
¡ | 32 | 100.0%

Interactions

[Figures: 25 interaction plots; images not preserved in this extract]

Correlations

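(variable) | DATA | QUANTITYORDERED | PRICEEACH | SALES | MONTH_ID | QTR_ID | YEAR_ID | PRODUCTLINE | STATE | COUNTRY
DATA | 1.000 | 0.043 | -0.004 | 0.021 | -0.012 | 0.833 | 0.954 | 0.000 | 0.336 | 0.257
QUANTITYORDERED | 0.043 | 1.000 | 0.006 | 0.538 | -0.026 | 0.139 | 0.193 | 0.000 | 0.108 | 0.041
PRICEEACH | -0.004 | 0.006 | 1.000 | 0.788 | 0.011 | 0.023 | 0.016 | 0.146 | 0.000 | 0.018
SALES | 0.021 | 0.538 | 0.788 | 1.000 | -0.002 | 0.021 | 0.056 | 0.112 | 0.000 | 0.000
MONTH_ID | -0.012 | -0.026 | 0.011 | -0.002 | 1.000 | 0.999 | 0.414 | 0.038 | 0.308 | 0.253
QTR_ID | 0.833 | 0.139 | 0.023 | 0.021 | 0.999 | 1.000 | 0.380 | 0.020 | 0.356 | 0.236
YEAR_ID | 0.954 | 0.193 | 0.016 | 0.056 | 0.414 | 0.380 | 1.000 | 0.005 | 0.314 | 0.224
PRODUCTLINE | 0.000 | 0.000 | 0.146 | 0.112 | 0.038 | 0.020 | 0.005 | 1.000 | 0.180 | 0.157
STATE | 0.336 | 0.108 | 0.000 | 0.000 | 0.308 | 0.356 | 0.314 | 0.180 | 1.000 | 0.996
COUNTRY | 0.257 | 0.041 | 0.018 | 0.000 | 0.253 | 0.236 | 0.224 | 0.157 | 0.996 | 1.000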

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
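
As a rough illustration of what a nullity correlation expresses, the sketch below computes the Pearson correlation between the missing-value indicators of two columns. This is an assumption about the computation, not something stated in the report; the nullity_correlation helper and the toy state and postal lists are hypothetical, with None standing in for a missing cell.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column is never missing or always missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns, loosely modelled on STATE and POSTALCODE.
state = ["NY", None, None, "CA", "CA", None]
postal = ["10022", "51100", "75508", "90003", None, None]

print(nullity_correlation(state, postal))
```

A value near 1 would mean the two columns tend to be missing in the same rows, a value near -1 that one tends to be present where the other is missing, and a value near 0 that their missingness is unrelated.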

Sample

(index) | DATA | QUANTITYORDERED | PRICEEACH | SALES | ORDERDATE | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | PHONE | ADDRESSLINE1 | CITY | STATE | POSTALCODE | COUNTRY | CONTACTLASTNAME | CONTACTFIRSTNAME
0 | 10107 | 30 | 95.70 | 2871.00 | 2/24/2003 0:00 | 1 | 2 | 2003 | Motorcycles | 2125557818 | 897 Long Airport Avenue | NYC | NY | 10022 | USA | Yu | Kwai
1 | 10121 | 34 | 81.35 | 2765.90 | 05-07-2003 00:00 | 2 | 5 | 2003 | Motorcycles | 26.47.1555 | 59 rue de l'Abbaye | Reims | NaN | 51100 | France | Henriot | Paul
2 | 10134 | 41 | 94.74 | 3884.34 | 07-01-2003 00:00 | 3 | 7 | 2003 | Motorcycles | +33 1 46 62 7555 | 27 rue du Colonel Pierre Avia | Paris | NaN | 75508 | France | Da Cunha | Daniel
3 | 10145 | 45 | 83.26 | 3746.70 | 8/25/2003 0:00 | 3 | 8 | 2003 | Motorcycles | 6265557265 | 78934 Hillside Dr. | Pasadena | CA | 90003 | USA | Young | Julie
4 | 10159 | 49 | 100.00 | 5205.27 | 10-10-2003 00:00 | 4 | 10 | 2003 | Motorcycles | 6505551386 | 7734 Strong St. | San Francisco | CA | NaN | USA | Brown | Julie
5 | 10168 | 36 | 96.66 | 3479.76 | 10/28/2003 0:00 | 4 | 10 | 2003 | Motorcycles | 6505556809 | 9408 Furth Circle | Burlingame | CA | 94217 | USA | Hirano | Juri
6 | 10180 | 29 | 86.13 | 2497.77 | 11-11-2003 00:00 | 4 | 11 | 2003 | Motorcycles | 20.16.1555 | 184, chausse de Tournai | Lille | NaN | 59000 | France | Rance | Martine
7 | 10188 | 48 | 100.00 | 5512.32 | 11/18/2003 0:00 | 4 | 11 | 2003 | Motorcycles | +47 2267 3215 | Drammen 121, PR 744 Sentrum | Bergen | NaN | N 5804 | Norway | Oeztan | Veysel
8 | 10201 | 22 | 98.57 | 2168.54 | 12-01-2003 00:00 | 4 | 12 | 2003 | Motorcycles | 6505555787 | 5557 North Pendale Street | San Francisco | CA | NaN | USA | Murphy | Julie
9 | 10211 | 41 | 100.00 | 4708.44 | 1/15/2004 0:00 | 1 | 1 | 2004 | Motorcycles | (1) 47.55.6555 | 25, rue Lauriston | Paris | NaN | 75016 | France | Perrier | Dominique

(index) | DATA | QUANTITYORDERED | PRICEEACH | SALES | ORDERDATE | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | PHONE | ADDRESSLINE1 | CITY | STATE | POSTALCODE | COUNTRY | CONTACTLASTNAME | CONTACTFIRSTNAME
2813 | 10293 | 32 | 60.06 | 1921.92 | 09-09-2004 00:00 | 3 | 9 | 2004 | Ships | 011-4988555 | Via Monte Bianco 34 | Torino | NaN | 10100 | Italy | Accorti | Paolo
2814 | 10306 | 35 | 59.51 | 2082.85 | 10/14/2004 0:00 | 4 | 10 | 2004 | Ships | (171) 555-1555 | Fauntleroy Circus | Manchester | NaN | EC2 5NT | UK | Ashworth | Victoria
2815 | 10315 | 40 | 55.69 | 2227.60 | 10/29/2004 0:00 | 4 | 10 | 2004 | Ships | 40.67.8555 | 67, rue des Cinquante Otages | Nantes | NaN | 44000 | France | Labrune | Janine
2816 | 10327 | 37 | 86.74 | 3209.38 | 11-10-2004 00:00 | 4 | 11 | 2004 | Ships | 31 12 3555 | Vinb'ltet 34 | Kobenhavn | NaN | 1734 | Denmark | Petersen | Jytte
2817 | 10337 | 42 | 97.16 | 4080.72 | 11/21/2004 0:00 | 4 | 11 | 2004 | Ships | 2125558493 | 5905 Pompton St. | NYC | NY | 10022 | USA | Hernandez | Maria
2818 | 10350 | 20 | 100.00 | 2244.40 | 12-02-2004 00:00 | 4 | 12 | 2004 | Ships | (91) 555 94 44 | C/ Moralzarzal, 86 | Madrid | NaN | 28034 | Spain | Freyre | Diego
2819 | 10373 | 29 | 100.00 | 3978.51 | 1/31/2005 0:00 | 1 | 1 | 2005 | Ships | 981-443655 | Torikatu 38 | Oulu | NaN | 90110 | Finland | Koskitalo | Pirkko
2820 | 10386 | 43 | 100.00 | 5417.57 | 03-01-2005 00:00 | 1 | 3 | 2005 | Ships | (91) 555 94 44 | C/ Moralzarzal, 86 | Madrid | NaN | 28034 | Spain | Freyre | Diego
2821 | 10397 | 34 | 62.24 | 2116.16 | 3/28/2005 0:00 | 1 | 3 | 2005 | Ships | 61.77.6555 | 1 rue Alsace-Lorraine | Toulouse | NaN | 31000 | France | Roulet | Annette
2822 | 10414 | 47 | 65.52 | 3079.44 | 05-06-2005 00:00 | 2 | 5 | 2005 | Ships | 6175559555 | 8616 Spinnaker Dr. | Boston | MA | 51003 | USA | Yoshido | Juri